Spectral biclustering of microarray cancer data: co-clustering genes and conditions
Abstract
Global analyses of RNA expression levels are useful for classifying genes and overall phenotypes. Often these classification problems are linked, and one wants to simultaneously find "marker genes" that are differentially expressed in particular "conditions". We have developed a method that simultaneously clusters genes and conditions, finding distinctive "checkerboard" patterns in matrices of gene expression data, if they exist. In a cancer context, these checkerboards correspond to genes that are markedly up- or down-regulated in patients with particular types of tumors. Our method, spectral biclustering, is based on the observation that checkerboard structures in matrices of expression data can be found in eigenvectors corresponding to characteristic expression patterns across genes or conditions. Furthermore, these eigenvectors can be readily identified by commonly used linear-algebra approaches, in particular the singular value decomposition (SVD), coupled with closely integrated normalization steps. We present a number of variants of the approach, depending on whether the normalization over genes and conditions is done independently or in a coupled fashion. We then apply spectral biclustering to a selection of publicly available cancer expression data sets and examine the degree to which it is able to identify checkerboard structures. Finally, we compare the performance of our biclustering methods against a number of reasonable benchmarks (e.g., direct application of SVD or normalized cuts to raw data).
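The idea of normalizing first and then reading the checkerboard off the singular vectors can be illustrated with a minimal sketch. The abstract does not give the normalization formulas, so the code below assumes one plausible form of the independent variant: each entry is divided by the square roots of its row sum and column sum, which requires a matrix with positive row and column sums. The function names `independent_rescaling` and `spectral_bipartition`, the toy matrix, and the sign-based split into two groups (used here in place of a proper clustering step on the singular vectors) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def independent_rescaling(A):
    """Assumed normalization: R^{-1/2} A C^{-1/2}, where R and C are the
    diagonal matrices of row sums and column sums of A."""
    A = np.asarray(A, dtype=float)
    row_sums = A.sum(axis=1)
    col_sums = A.sum(axis=0)
    return A / np.sqrt(np.outer(row_sums, col_sums))


def spectral_bipartition(A):
    """Split genes (rows) and conditions (columns) into two groups each,
    using the second singular vector pair of the normalized matrix."""
    A = np.asarray(A, dtype=float)
    U, s, Vt = np.linalg.svd(independent_rescaling(A), full_matrices=False)
    # The leading singular pair only reflects the row/column sums, so the
    # partition is read from the second pair, rescaled back by the sums.
    u = U[:, 1] / np.sqrt(A.sum(axis=1))
    v = Vt[1, :] / np.sqrt(A.sum(axis=0))
    # Simplification: threshold at zero instead of clustering the entries.
    return (u > 0).astype(int), (v > 0).astype(int)


# Toy 6-gene x 4-condition checkerboard (hypothetical data), lightly
# perturbed and then shuffled so the block structure is hidden.
rng = np.random.default_rng(0)
blocks = np.array([[10.0, 1.0], [1.0, 10.0]])
A = np.kron(blocks, np.ones((3, 2)))
A += rng.uniform(0.0, 0.2, size=A.shape)
gene_order = rng.permutation(A.shape[0])
cond_order = rng.permutation(A.shape[1])
shuffled = A[np.ix_(gene_order, cond_order)]

gene_labels, cond_labels = spectral_bipartition(shuffled)
print("gene groups:     ", gene_labels)
print("condition groups:", cond_labels)
```

Because the sign of a singular vector is arbitrary, the labels 0 and 1 may swap between runs or platforms, but genes and conditions that receive the same label form the up-regulated blocks in this toy example. For more than two groups per axis, a clustering step such as k-means over several singular vectors would be needed in place of the sign threshold.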
Similar resources
Application of biclustering with the "large average submatrices" method to gene expression data from DNA microarrays
Background and Objective: In recent years, DNA microarray technology has become a central tool in genomic research. Using this technology, which made it possible to simultaneously analyze expression levels for thousands of genes under different conditions, massive amounts of information will be obtained. While traditional clustering methods, such as hierarchical and K-means clustering have been...
GEMS: a web server for biclustering analysis of expression data
The advent of microarray technology has revolutionized the search for genes that are differentially expressed across a range of cell types or experimental conditions. Traditional clustering methods, such as hierarchical clustering, are often difficult to deploy effectively since genes rarely exhibit similar expression patterns across a wide range of conditions. Biclustering of gene expression da...
A New Survey on Biclustering of Microarray Data
There are subsets of genes that have similar behavior under subsets of conditions, so we say that they coexpress, but behave independently under other subsets of conditions. Discovering such coexpressions can be helpful to uncover genomic knowledge such as gene networks or gene interactions. That is why it is of utmost importance to make a simultaneous clustering of genes and conditions to ide...
A Comparative Study of Clustering and Biclustering of Microarray Data
There are subsets of genes that have similar behavior under subsets of conditions, so we say that they coexpress, but behave independently under other subsets of conditions. Discovering such coexpressions can be helpful to uncover genomic knowledge such as gene networks or gene interactions. That is why it is of utmost importance to make a simultaneous clustering of genes and conditions to ide...
BiCross: A Biclustering Technique for Gene Expression Data using One Layer Fixed Weighted Bipartite Graph Crossing Minimization
Biclustering has become an important data mining technique for microarray gene expression analysis and profiling, as it provides a local view of the hidden relationships in data, unlike a global view provided by conventional clustering techniques. This technique, in contrast to the conventional clustering techniques, helps in identifying a subset of the genes and a subset of the experimental co...